Haochen Wang

FastMap: Revisiting Dense and Scalable Structure from Motion

May 07, 2025

Integrating Learning-Based Manipulation and Physics-Based Locomotion for Whole-Body Badminton Robot Control

Apr 24, 2025

The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer

Apr 14, 2025

Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness

Apr 02, 2025

DocVideoQA: Towards Comprehensive Understanding of Document-Centric Videos through Question Answering

Mar 20, 2025

OpenSatMap: A Fine-grained High-resolution Satellite Dataset for Large-scale Map Construction

Oct 30, 2024

Reconstructive Visual Instruction Tuning

Oct 12, 2024

CJEval: A Benchmark for Assessing Large Language Models Using Chinese Junior High School Exam Data

Sep 25, 2024

DocTabQA: Answering Questions from Long Documents Using Tables

Aug 21, 2024

VISA: Reasoning Video Object Segmentation via Large Language Models

Jul 16, 2024